asymptotically normal
Post-Contextual-Bandit Inference
Contextual bandit algorithms are increasingly replacing non-adaptive A/B tests in e-commerce, healthcare, and policymaking because they can both improve outcomes for study participants and increase the chance of identifying good or even best policies. Nonetheless, to support credible inference on novel interventions at the end of the study, we still want to construct valid confidence intervals on average treatment effects, subgroup effects, or the value of new policies. The adaptive nature of the data collected by contextual bandit algorithms, however, makes this difficult: standard estimators are no longer asymptotically normally distributed and classic confidence intervals fail to provide correct coverage. While this has been addressed in non-contextual settings by using stabilized estimators, variance-stabilized estimators in the contextual setting pose unique challenges that we tackle for the first time in this paper. We propose the Contextual Adaptive Doubly Robust (CADR) estimator, a novel estimator for policy value that is asymptotically normal under contextual adaptive data collection. The main technical challenge in constructing CADR is designing adaptive and consistent conditional standard deviation estimators for stabilization. Extensive numerical experiments using 57 OpenML datasets demonstrate that confidence intervals based on CADR uniquely provide correct coverage.
- Africa > Cameroon (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Health & Medicine (1.00)
- Information Technology > Services (0.48)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.69)
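The stabilization idea in the CADR abstract can be illustrated with a minimal sketch. It is not the paper's estimator: it assumes that per-round scores (for example doubly robust scores) and estimates of their conditional standard deviations are already available, and the function name `stabilized_estimate` and its arguments are hypothetical. Each score is weighted by the inverse of its estimated standard deviation, so every round contributes a term with roughly unit variance.

```python
import math

def stabilized_estimate(scores, sds, z=1.96):
    """Inverse-standard-deviation weighted average with a normal-theory interval.

    scores: per-round scores whose conditional mean is the target value
            (assumed to be computed elsewhere).
    sds:    estimated conditional standard deviations of those scores
            (assumed given; estimating them is the hard part in practice).
    """
    if not scores or len(scores) != len(sds):
        raise ValueError("scores and sds must be non-empty and of equal length")
    if any(s <= 0 for s in sds):
        raise ValueError("standard deviations must be positive")
    inv = [1.0 / s for s in sds]
    total = sum(inv)
    estimate = sum(w * d for w, d in zip(inv, scores)) / total
    # The sum of the standardized terms has variance about T,
    # so the weighted average has standard error sqrt(T) / total.
    half_width = z * math.sqrt(len(scores)) / total
    return estimate, (estimate - half_width, estimate + half_width)
```

With constant standard deviations the weights cancel and the interval reduces to the usual sample mean plus or minus `z * sd / sqrt(T)`.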
Local Polynomial Lp-norm Regression
Tazik, Ladan, Stafford, James, Braun, John
The local least squares estimator for a regression curve cannot provide optimal results when non-Gaussian noise is present. Both theoretical and empirical evidence suggests that residuals often exhibit distributional properties different from those of a normal distribution, making it worthwhile to consider estimation based on other norms. It is suggested that $L_p$-norm estimators be used to minimize the residuals when these exhibit non-normal kurtosis. In this paper, we propose a local polynomial $L_p$-norm regression that replaces weighted least squares estimation with weighted $L_p$-norm estimation for fitting the polynomial locally. We also introduce a new method for estimating the parameter $p$ from the residuals, enhancing the adaptability of the approach. Through numerical and theoretical investigation, we demonstrate our method's superiority over local least squares in one-dimensional data and show promising outcomes for higher dimensions, specifically in 2D.
- North America > Canada > Ontario > Toronto (0.14)
- North America > Canada > British Columbia > Regional District of Central Okanagan > Kelowna (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
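A local linear fit under a kernel-weighted $L_p$ loss can be sketched as follows. This is an illustration only: the Gaussian kernel, the iteratively reweighted least squares (IRLS) solver, the residual floor `eps`, and the name `local_linear_lp` are assumptions made here, not details taken from the abstract, and $p$ is supplied by the caller rather than estimated from the residuals.

```python
import math

def local_linear_lp(x, y, x0, h, p=1.5, iters=50, eps=1e-8):
    """Estimate the regression curve at x0 by minimizing
    sum_i K((x_i - x0) / h) * |y_i - a - b * (x_i - x0)|**p over (a, b)."""
    u = [xi - x0 for xi in x]
    # Gaussian kernel weights centred at x0 (the kernel choice is an assumption).
    k = [math.exp(-0.5 * (ui / h) ** 2) for ui in u]
    w = k[:]  # the first pass is ordinary weighted least squares (p = 2)
    a = b = 0.0
    for _ in range(iters):
        s0 = sum(w)
        s1 = sum(wi * ui for wi, ui in zip(w, u))
        s2 = sum(wi * ui * ui for wi, ui in zip(w, u))
        t0 = sum(wi * yi for wi, yi in zip(w, y))
        t1 = sum(wi * ui * yi for wi, ui, yi in zip(w, u, y))
        det = s0 * s2 - s1 * s1
        if det == 0.0:
            raise ValueError("degenerate local design; widen the bandwidth h")
        # Solve the 2x2 weighted normal equations for intercept a and slope b.
        a = (s2 * t0 - s1 * t1) / det
        b = (s0 * t1 - s1 * t0) / det
        # IRLS step: |r|**p equals |r|**(p - 2) * r**2, so reweight by |r|**(p - 2).
        w = [ki * max(abs(yi - a - b * ui), eps) ** (p - 2.0)
             for ki, ui, yi in zip(k, u, y)]
    return a  # the intercept is the fitted value at x0
```

With `p=2` the loop reproduces ordinary local linear least squares; values of `p` below 2 down-weight large residuals, and values above 2 up-weight them.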
A Computationally Efficient Method for Learning Exponential Family Distributions
We consider the question of learning the natural parameters of a $k$-parameter minimal exponential family from i.i.d. samples. We focus on the setting where the support as well as the natural parameters are appropriately bounded. While the traditional maximum likelihood estimator for this class of exponential family is consistent, asymptotically normal, and asymptotically efficient, evaluating it is computationally hard. In this work, we propose a computationally efficient estimator that is consistent as well as asymptotically normal under mild conditions. We provide finite sample guarantees to achieve an $\ell_2$ error of $\alpha$ in the parameter estimation with sample complexity $O(\mathrm{poly}(k/\alpha))$ and computational complexity $O(\mathrm{poly}(k/\alpha))$.
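For orientation, the natural-parameter form of an exponential family is usually written as below. The symbols $h$, $\phi$, and $A$ are standard textbook notation assumed here; they do not appear in the abstract.

$$
p_\theta(x) = h(x)\,\exp\!\big(\theta^\top \phi(x) - A(\theta)\big),
\qquad
A(\theta) = \log \int h(x)\,\exp\!\big(\theta^\top \phi(x)\big)\,dx,
\qquad \theta \in \mathbb{R}^k .
$$

The family is minimal when no nontrivial linear combination of the components of $\phi$ is constant almost everywhere. Maximum likelihood requires the log-partition function $A(\theta)$ or its gradient, which is an integral over the support; this is the usual source of the computational hardness mentioned above.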
Post-Episodic Reinforcement Learning Inference
Syrgkanis, Vasilis, Zhan, Ruohan
We consider estimation and inference with data collected from episodic reinforcement learning (RL) algorithms; i.e. adaptive experimentation algorithms that at each period (aka episode) interact multiple times in a sequential manner with a single treated unit. Our goal is to be able to evaluate counterfactual adaptive policies after data collection and to estimate structural parameters such as dynamic treatment effects, which can be used for credit assignment (e.g. what was the effect of the first period action on the final outcome). Such parameters of interest can be framed as solutions to moment equations, but not minimizers of a population loss function, leading to $Z$-estimation approaches in the case of static data. However, such estimators fail to be asymptotically normal in the case of adaptive data collection. We propose a re-weighted $Z$-estimation approach with carefully designed adaptive weights to stabilize the episode-varying estimation variance, which results from the nonstationary policy that typical episodic RL algorithms invoke. We identify proper weighting schemes to restore the consistency and asymptotic normality of the re-weighted $Z$-estimators for target parameters, which allows for hypothesis testing and constructing uniform confidence regions for target parameters of interest. Primary applications include dynamic treatment effect estimation and dynamic off-policy evaluation.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > China > Hong Kong (0.04)
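In its simplest scalar form, re-weighted $Z$-estimation solves a weighted moment equation $\sum_t w_t\, m(Z_t; \theta) = 0$ for $\theta$. The sketch below is a generic root-finder under that reading; the name `weighted_z_estimate`, the bisection solver, and the requirement that the caller supplies the weights and a bracketing interval are all assumptions made here. The paper's specific adaptive weighting schemes are not reproduced.

```python
def weighted_z_estimate(data, weights, moment, lo, hi, tol=1e-10):
    """Solve sum_t w_t * moment(z_t, theta) = 0 for a scalar theta by bisection.

    data:    per-episode observations z_t.
    weights: per-episode weights w_t (assumed given, e.g. chosen to
             stabilize the episode-varying variance).
    moment:  function (z, theta) -> float, assumed to change sign on [lo, hi].
    """
    def g(theta):
        return sum(w * moment(z, theta) for z, w in zip(data, weights))

    g_lo, g_hi = g(lo), g(hi)
    if g_lo == 0.0:
        return lo
    if g_hi == 0.0:
        return hi
    if g_lo * g_hi > 0:
        raise ValueError("the root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid  # sign change lies in the lower half
        else:
            lo, g_lo = mid, g_mid  # sign change lies in the upper half
    return 0.5 * (lo + hi)
```

For the moment `lambda y, theta: y - theta` the solution is the weighted mean of the observations, which makes the effect of the weights easy to check by hand.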
Falsification before Extrapolation in Causal Effect Estimation
Hussain, Zeshan, Oberst, Michael, Shih, Ming-Chieh, Sontag, David
Randomized Controlled Trials (RCTs) represent a gold standard when developing policy guidelines. However, RCTs are often narrow, and lack data on broader populations of interest. Causal effects in these populations are often estimated using observational datasets, which may suffer from unobserved confounding and selection bias. Given a set of observational estimates (e.g. from multiple studies), we propose a meta-algorithm that attempts to reject observational estimates that are biased. We do so using validation effects, causal effects that can be inferred from both RCT and observational data. After rejecting estimators that do not pass this test, we generate conservative confidence intervals on the extrapolated causal effects for subgroups not observed in the RCT. Under the assumption that at least one observational estimator is asymptotically normal and consistent for both the validation and extrapolated effects, we provide guarantees on the coverage probability of the intervals output by our algorithm. To facilitate hypothesis testing in settings where causal effect transportation across datasets is necessary, we give conditions under which a doubly-robust estimator of group average treatment effects is asymptotically normal, even when flexible machine learning methods are used for estimation of nuisance parameters. We illustrate the properties of our approach on semi-synthetic and real world datasets, and show that it compares favorably to standard meta-analysis techniques.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Asia > Taiwan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
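The falsify-then-extrapolate procedure described in the last abstract can be sketched in a few lines. This is a simplified reading, not the paper's algorithm: the two-sided z-test on the validation effect, the independence of the RCT and observational estimates, the union of the surviving intervals as the conservative interval, and the names `falsify_then_extrapolate`, `val_est`, `val_se`, `ext_est`, and `ext_se` are all assumptions made for illustration.

```python
import math

def falsify_then_extrapolate(rct_val, rct_se, candidates, z=1.96):
    """Reject observational estimators that disagree with the RCT on a
    validation effect, then return a conservative interval for the
    extrapolated effect, or None if every candidate is rejected.

    candidates: list of dicts with keys
        val_est, val_se: estimate and standard error of the validation effect
        ext_est, ext_se: estimate and standard error of the extrapolated effect
    """
    kept = []
    for c in candidates:
        # Standard error of the difference, assuming independent estimates.
        diff_se = math.sqrt(rct_se ** 2 + c["val_se"] ** 2)
        if abs(c["val_est"] - rct_val) <= z * diff_se:
            kept.append(c)
    if not kept:
        return None
    # Conservative interval: the union of the surviving estimators' intervals.
    lower = min(c["ext_est"] - z * c["ext_se"] for c in kept)
    upper = max(c["ext_est"] + z * c["ext_se"] for c in kept)
    return lower, upper
```

Taking the union keeps the interval valid as long as at least one surviving estimator is well behaved, at the cost of extra width.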